
    Distinguishing Computer-generated Graphics from Natural Images Based on Sensor Pattern Noise and Deep Learning

    Computer-generated graphics (CGs) are images generated by computer software. The rapid development of computer graphics technologies has made it easier to generate photorealistic computer graphics, and these graphics are quite difficult to distinguish from natural images (NIs) with the naked eye. In this paper, we propose a method based on sensor pattern noise (SPN) and deep learning to distinguish CGs from NIs. Before being fed into our convolutional neural network (CNN)-based model, the images (both CGs and NIs) are clipped into image patches. Furthermore, three high-pass filters (HPFs) are used to remove low-frequency signals, which represent the image content. These filters also reveal the residual signal as well as the SPN introduced by the digital camera device. Unlike traditional methods of distinguishing CGs from NIs, the proposed method uses a five-layer CNN to classify the input image patches. Based on the classification results of the image patches, we deploy a majority vote scheme to obtain the classification results for the full-size images. The experiments demonstrate that (1) the proposed method with three HPFs achieves better results than with only one HPF or no HPF and that (2) the proposed method with three HPFs achieves 100% accuracy even when the NIs undergo JPEG compression with a quality factor of 75. Comment: This paper has been published by Sensors. doi:10.3390/s18041296; Sensors 2018, 18(4), 129
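
The majority vote step described in this abstract can be illustrated with a minimal sketch. It assumes that each patch has already been classified as "CG" or "NI" by some model; the function name `majority_vote` and the tie-breaking behaviour are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_vote(patch_labels: Sequence[str]) -> str:
    """Return the most frequent patch label for one full-size image.

    Hypothetical helper: the paper only states that a majority vote over
    patch-level results is used, not how ties are resolved.
    """
    if not patch_labels:
        raise ValueError("need at least one patch prediction")
    counts = Counter(patch_labels)
    # most_common(1) returns [(label, count)]; on a tie, the label that
    # was encountered first in patch_labels is returned.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Five patch predictions for one image (made-up values for illustration).
    print(majority_vote(["CG", "NI", "CG", "CG", "NI"]))
```

With an odd number of patches and two classes, ties cannot occur, which is one reason a sketch like this might sample an odd patch count per image.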

    Project RISE: Recognizing Industrial Smoke Emissions

    Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets have neither the quality nor the quantity needed to train the robust CV models that air quality advocacy requires. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach, collaborating with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for social good. Comment: Technical repor

    Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation

    Human face images usually appear at a wide range of visual scales. Existing face representations handle scale variation via a multi-scale scheme that assembles a finite series of predefined scales. Such a multi-shot scheme adds an inference burden, and the predefined scales inevitably leave a gap from real data. Instead, learning scale parameters from data and using them for one-shot feature inference is a better solution. To this end, we reform the conv layer by resorting to scale-space theory, and achieve two benefits: 1) the conv layer learns a set of scales from the real data distribution, each of which is realized by a conv kernel; 2) the layer automatically highlights the feature at the proper channel and location corresponding to the input pattern scale and its presence. Then, we accomplish hierarchical scale attention by stacking the reformed layers, building a novel architecture named SCale AttentioN Conv Neural Network (SCAN-CNN). We apply SCAN-CNN to the face recognition task and push the frontier of SOTA performance. The accuracy gain is more evident when the face images are blurry. Meanwhile, as a single-shot scheme, its inference is more efficient than multi-shot fusion. A set of tools is provided to ensure fast training of SCAN-CNN and zero increase in inference cost compared with the plain CNN.

    Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis

    Adapting generic speech recognition models to specific individuals is a challenging problem due to the scarcity of personalized data. Recent works have proposed boosting the amount of training data using personalized text-to-speech synthesis. Here, we ask two fundamental questions about this strategy: when is synthetic data effective for personalization, and why is it effective in those cases? To address the first question, we adapt a state-of-the-art automatic speech recognition (ASR) model to target speakers from four benchmark datasets representative of different speaker types. We show that ASR personalization with synthetic data is effective in all cases, but particularly when (i) the target speaker is underrepresented in the global data, and (ii) the capacity of the global model is limited. To address the second question of why personalized synthetic data is effective, we use controllable speech synthesis to generate speech with varied styles and content. Surprisingly, we find that the text content of the synthetic data, rather than style, is important for speaker adaptation. These results lead us to propose a data selection strategy for ASR personalization based on speech content. Comment: ICASSP 202

    Deep Time-Stream Framework for Click-Through Rate Prediction by Tracking Interest Evolution

    Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation. Recently, deep learning models have been proposed to learn the representation of users' overall interests, while ignoring the fact that interests may dynamically change over time. We argue that it is necessary to consider continuous-time information in CTR models to track user interest trends from rich historical behaviors. In this paper, we propose a novel Deep Time-Stream framework (DTS), which introduces time information through an ordinary differential equation (ODE). DTS continuously models the evolution of interests using a neural network, and is thus able to tackle the challenge of dynamically representing users' interests based on their historical behaviors. In addition, our framework can be seamlessly applied to any existing deep CTR model by adding the Time-Stream Module, with no changes to the original CTR model. Experiments on a public dataset as well as a real industry dataset with billions of samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with existing methods. Comment: 8 pages. arXiv admin note: text overlap with arXiv:1809.03672 by other author
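
The idea of evolving an interest state in continuous time through an ODE can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the solver or the form of the dynamics, so the fixed-step Euler integrator `euler_evolve` and the toy derivative `decay_toward_zero` (standing in for the neural network that DTS would use) are hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]


def euler_evolve(
    h0: Vector,
    t0: float,
    t1: float,
    deriv: Callable[[Vector, float], Vector],
    n_steps: int = 100,
) -> Vector:
    """Integrate dh/dt = deriv(h, t) from t0 to t1 with fixed-step Euler."""
    h = list(h0)
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        dh = deriv(h, t)
        # One explicit Euler update: h <- h + dt * dh/dt
        h = [hi + dt * di for hi, di in zip(h, dh)]
        t += dt
    return h


def decay_toward_zero(h: Vector, t: float) -> Vector:
    """Toy dynamics (assumption): interest decays exponentially at rate 0.5."""
    return [-0.5 * hi for hi in h]


if __name__ == "__main__":
    # Evolve a one-dimensional interest state across a gap of 2 time units.
    h_later = euler_evolve([1.0], 0.0, 2.0, decay_toward_zero, n_steps=1000)
    print(h_later[0], "vs exact", math.exp(-1.0))
```

Because the state is defined at any time `t1`, irregular gaps between user behaviors need no special handling, which is the property the abstract attributes to continuous-time modeling.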

    Critical Roles of microRNA-141-3p and CHD8 in Hypoxia/Reoxygenation-Induced Cardiomyocyte Apoptosis

    Background: Cardiovascular diseases are currently the leading cause of death in humans. The high mortality of cardiac diseases is associated with myocardial ischemia and reperfusion (I/R). Recent studies have reported that microRNAs (miRNAs) play important roles in cell apoptosis. However, it is not yet known whether miR-141-3p contributes to the regulation of cardiomyocyte apoptosis. It has been well established that the in vitro hypoxia/reoxygenation (H/R) model can mimic in vivo myocardial I/R injury. This study aimed to investigate the effects of miR-141-3p and CHD8 on cardiomyocyte apoptosis following H/R. Results: We found that H/R remarkably reduces the expression of miR-141-3p but enhances CHD8 expression at both the mRNA and protein levels in H9c2 cardiomyocytes. We also found that either overexpression of miR-141-3p by transfection of miR-141-3p mimics or inhibition of CHD8 by transfection of small interfering RNA (siRNA) significantly decreases cardiomyocyte apoptosis induced by H/R. Moreover, miR-141-3p interacts with CHD8. Furthermore, miR-141-3p and CHD8 reduce the expression of p21. Conclusion: MiR-141-3p and CHD8 play critical roles in cardiomyocyte apoptosis induced by H/R. These studies suggest that miR-141-3p- and CHD8-mediated cardiomyocyte apoptosis may offer a novel therapeutic strategy against myocardial I/R injury-induced cardiovascular diseases.

    Magnetic dilution effect and topological phase transitions in (Mn$_{1-x}$Pb$_x$)Bi$_2$Te$_4$

    As the first intrinsic antiferromagnetic (AFM) topological insulator (TI), MnBi$_2$Te$_4$ has provided a material platform to realize various emergent phenomena arising from the interplay of magnetism and band topology. Here, by investigating (Mn$_{1-x}$Pb$_x$)Bi$_2$Te$_4$ ($0 \leq x \leq 0.82$) single crystals via x-ray, electrical transport, magnetometry and neutron measurements, chemical analysis, external pressure, and first-principles calculations, we reveal the magnetic dilution effect on the magnetism and band topology in MnBi$_2$Te$_4$. With increasing $x$, both lattice parameters $a$ and $c$ expand linearly by around 2%. All samples undergo the paramagnetic to A-type antiferromagnetic transition, with the Néel temperature decreasing linearly from 24 K at $x=0$ to 2 K at $x=0.82$. Our neutron data refinement of the $x=0.37$ sample indicates that the ordered moment is 4.3(1) $\mu_B$/Mn at 4.85 K and the amount of the Mn$_{\rm Bi}$ antisites is negligible within the error bars. Isothermal magnetization data reveal a slight decrease of the interlayer plane-plane antiferromagnetic exchange interaction and a monotonic decrease of the magnetic anisotropy, due to diluting magnetic ions and enlarging the unit cell. For $x=0.37$, the application of external pressure enhances the interlayer antiferromagnetic coupling, boosting the Néel temperature at a rate of 1.4 K/GPa and the saturation field at a rate of 1.8 T/GPa. Furthermore, our first-principles calculations reveal that the band inversion in the two end materials, MnBi$_2$Te$_4$ and PbBi$_2$Te$_4$, occurs at the $\Gamma$ and $Z$ points, respectively, while two gapless points appear at $x = 0.44$ and $x = 0.66$, suggesting possible topological phase transitions with doping. Comment: 10 pages, 7 figure
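
As a rough worked example, the stated linear decrease of the Néel temperature between the two quoted endpoints (24 K at x = 0 and 2 K at x = 0.82) implies the straight-line relation below. This is only an interpolation between the two reported values, assuming exact linearity; the slope and the estimate at x = 0.37 are derived here for illustration and are not values reported in the abstract.

```latex
T_N(x) \approx 24\,\mathrm{K} - \frac{24\,\mathrm{K} - 2\,\mathrm{K}}{0.82}\,x
       \approx \left(24 - 26.8\,x\right)\mathrm{K},
\qquad 0 \le x \le 0.82

% Illustrative estimate at the composition studied by neutron refinement:
T_N(0.37) \approx 24\,\mathrm{K} - 26.8\,\mathrm{K} \times 0.37 \approx 14.1\,\mathrm{K}
```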